Towards Distributed and Adaptive Detection and Localisation of Network Faults
We present a statistical probing approach to distributed fault detection in networked systems, based on autonomous configuration of algorithm parameters. Statistical modelling is used for detection and localisation of network faults. A detected fault is isolated to a node or link through collaborative fault localisation. From local measurements obtained through probing between nodes, probe response delay and packet drop are modelled via parameter estimation for each link. The estimated model parameters are used for autonomous configuration of algorithm parameters related to probe intervals and detection mechanisms. Expected fault-detection performance is formulated as a cost rather than as specific parameter values, which significantly reduces configuration effort in a distributed system. Compared to methods that do not take observed network conditions into account, our algorithm offers fault detection with increased certainty based on local measurements. We investigate the algorithm's performance for varying user parameters and failure conditions. The simulation results indicate that more than 95% of the generated faults can be detected with few false alarms. At least 80% of the link faults and 65% of the node faults are correctly localised. Performance can be further improved by parameter adjustments and by using alternative paths for communication of algorithm control messages.
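As a hedged illustration of deriving a detection parameter from estimated link statistics, the sketch below computes how many consecutive probe losses should be required before declaring a fault, given an estimated per-probe loss probability and a target false-alarm probability. The independence assumption, function name, and numbers are illustrative, not taken from the paper's cost model.

```python
def consecutive_loss_threshold(p_loss: float, max_false_alarm: float) -> int:
    """Return the smallest number k of consecutive probe losses to
    require before declaring a fault, such that the probability of k
    chance losses on a healthy link (p_loss ** k) does not exceed
    max_false_alarm. Assumes independent probe losses."""
    # Solve p_loss**k <= max_false_alarm for the smallest integer k.
    k = 1
    while p_loss ** k > max_false_alarm:
        k += 1
    return k

# A lossier link needs more consecutive losses before a fault is
# declared, which lengthens the effective detection time.
print(consecutive_loss_threshold(0.1, 0.005))  # 3
print(consecutive_loss_threshold(0.5, 0.001))  # 10
```

Expressing the requirement as a false-alarm budget rather than a fixed count is what lets each node pick its own threshold from locally estimated loss rates.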
An initial approach to distributed adaptive fault-handling in networked systems
We present a distributed adaptive fault-handling algorithm for networked systems. The probabilistic approach that we use makes the proposed method capable of adaptively detecting and localizing network faults using simple end-to-end test transactions. Our method operates in a fully distributed manner, such that each network element detects faults using locally extracted information as input. This allows for fast autonomous adaptation to local network conditions in real time, with a significantly reduced need for manual configuration of algorithm parameters. Initial results from a small synthetically generated network indicate that satisfactory algorithm performance can be achieved with respect to the number of detected and localized faults, detection time and false-alarm rate.
Long-term adaptation and distributed detection of local network changes
We present a statistical approach to distributed detection of local latency shifts in networked systems. For this purpose, response delay measurements are performed between neighbouring nodes via probing. The expected probe response delay on each connection is statistically modelled via parameter estimation. Adaptation to drifting delays is accounted for by the use of overlapping models, such that previous models are partially used as input to future models. Based on the symmetric Kullback-Leibler divergence metric, latency shifts can be detected by comparing the estimated parameters of the current and previous models. In order to reduce the number of detection alarms, thresholds for divergence and convergence are used.
The method that we propose can be applied to many types of statistical distributions, and requires only constant memory, in contrast to, e.g., sliding-window techniques and decay functions. The method is therefore applicable in various kinds of capacity-limited network equipment, such as sensor networks and mobile ad hoc networks. We have investigated the behaviour of the method for different model parameters. Further, we have tested the detection performance in network simulations, for both gradual and abrupt shifts in the probe response delay. The results indicate that over 90% of the shifts can be detected. Undetected shifts are mainly the effect of long convergence processes triggered by previous shifts. The overall performance depends on the characteristics of the shifts and the configuration of the model parameters.
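For Gaussian delay models, the symmetric Kullback-Leibler divergence used for the comparison has a closed form. The sketch below computes it for two estimated (mean, variance) pairs and applies separate divergence and convergence thresholds for hysteresis; the threshold values and names are illustrative assumptions, not the paper's configuration.

```python
import math

def sym_kl_gauss(mu_p, var_p, mu_q, var_q):
    """Symmetric KL divergence KL(p||q) + KL(q||p) between the
    univariate Gaussians p = N(mu_p, var_p) and q = N(mu_q, var_q)."""
    kl_pq = 0.5 * (var_p / var_q + (mu_q - mu_p) ** 2 / var_q
                   - 1.0 + math.log(var_q / var_p))
    kl_qp = 0.5 * (var_q / var_p + (mu_p - mu_q) ** 2 / var_p
                   - 1.0 + math.log(var_p / var_q))
    return kl_pq + kl_qp

# Illustrative hysteresis thresholds: raise an alarm when the
# divergence between the previous and current model exceeds DIVERGE,
# and clear it only once the divergence falls back below CONVERGE.
DIVERGE, CONVERGE = 2.0, 0.5

def update_alarm(alarmed, prev_model, curr_model):
    """prev_model and curr_model are (mean, variance) pairs."""
    d = sym_kl_gauss(*prev_model, *curr_model)
    if not alarmed:
        return d > DIVERGE
    return d >= CONVERGE  # stay alarmed until divergence drops below CONVERGE
```

Using two thresholds rather than one is what suppresses repeated alarms while a shifted model slowly re-converges.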
On Practical Machine Learning and Data Analysis
This thesis discusses and addresses some of the difficulties associated with practical machine learning and data analysis. Introducing data-driven methods in, e.g., industrial and business applications can lead to large gains in productivity and efficiency, but the cost and complexity are often overwhelming. Creating machine learning applications in practice often involves a large amount of manual labour, which typically needs to be performed by an experienced analyst who may lack significant experience with the application area. We discuss some of the hurdles faced in a typical analysis project and suggest measures and methods to simplify the process.
One of the most important issues when applying machine learning methods to complex data, such as industrial applications, is that the processes generating the data are modelled in an appropriate way. Relevant aspects have to be formalised and represented in a way that allows us to perform our calculations efficiently. We present a statistical modelling framework, Hierarchical Graph Mixtures, based on a combination of graphical models and mixture models. It allows us to create consistent, expressive statistical models that simplify the modelling of complex systems. Using a Bayesian approach, we allow for the encoding of prior knowledge and make the models applicable in situations where relatively little data are available.
Detecting structures in data, such as clusters and dependency structure, is very important both for understanding an application area and for specifying the structure of, e.g., a hierarchical graph mixture. We discuss how this structure can be extracted for sequential data. By using the inherent dependency structure of sequential data, we construct an information-theoretic measure of correlation that does not suffer from the problems that most common correlation measures have with this type of data.
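As a hedged sketch of the family of information-theoretic correlation measures referred to here, the code below estimates the mutual information between two discretised series from a joint histogram. The binning scheme and function name are illustrative choices, not the thesis's exact construction.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram-based estimate of the mutual information I(X; Y),
    in nats, between two equally long sample sequences."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()             # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px * py)[nz])))

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
noisy_copy = x + 0.1 * rng.normal(size=2000)  # strongly dependent series
independent = rng.normal(size=2000)           # unrelated series
# The dependent pair yields a much larger estimate than the independent one.
```

Unlike linear correlation coefficients, a mutual-information estimate of this kind also responds to nonlinear dependencies, which is what makes the information-theoretic framing attractive for sequential data.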
In many diagnosis situations it is desirable to perform classification in an iterative and interactive manner. The matter is often complicated by very limited amounts of knowledge and few examples when a new system to be diagnosed is initially brought into use. We describe how to create an incremental classification system based on a statistical model trained from empirical data, and show how the limited available background information can still be used initially to provide a functioning diagnosis system.
To minimise the effort with which results are achieved within data analysis projects, we need to address not only the models used, but also the methodology and applications that can help simplify the process. We present a methodology for data preparation and a software library intended for rapid analysis, prototyping, and deployment.
Finally, we study a few example applications, presenting tasks within classification, prediction and anomaly detection. The examples include demand prediction for supply-chain management, approximating complex simulators for increased speed in parameter optimisation, and fraud detection and classification within a media-on-demand system.
Autonomous Accident Monitoring Using Cellular Network Data
Mobile communication networks constitute large-scale sensor networks that generate huge amounts of data, which can be refined into collective mobility patterns. In this paper we propose a method for using these patterns to autonomously monitor and detect accidents and other critical events. The approach is to identify a measure that is approximately time-invariant on short time scales under regular conditions, estimate the short- and long-term dynamics of this measure using Bayesian inference, and identify sudden shifts in mobility patterns by monitoring the divergence between the short- and long-term estimates. By estimating long-term dynamics, the method is also able to adapt to long-term trends in the data. As a proof of concept, we apply this approach in a vehicular traffic scenario, where we demonstrate that the method can detect traffic accidents and distinguish them from regular events such as traffic congestion.
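A minimal sketch of the short-/long-term idea, using exponential forgetting as a simple stand-in for the paper's Bayesian estimators: two Gaussian estimates of the monitored measure are maintained with fast and slow forgetting factors, and an alarm is raised when their symmetric KL divergence exceeds a threshold. All parameter values and names are illustrative assumptions.

```python
class ShiftDetector:
    """Flag sudden shifts in a data stream by comparing a short-term
    (fast-forgetting) and a long-term (slow-forgetting) Gaussian
    estimate of the monitored measure via symmetric KL divergence.
    The slow estimate also tracks long-term trends, so the detector
    re-converges after a shift instead of alarming forever."""

    def __init__(self, fast=0.3, slow=0.02, threshold=3.0):
        self.fast, self.slow, self.threshold = fast, slow, threshold
        self.f = self.s = None  # (mean, variance) pairs

    @staticmethod
    def _ewma(model, alpha, x):
        # Exponentially weighted mean and variance update.
        mu, var = model
        mu = (1.0 - alpha) * mu + alpha * x
        var = (1.0 - alpha) * var + alpha * (x - mu) ** 2
        return mu, max(var, 1e-6)

    @staticmethod
    def _sym_kl(p, q):
        # Symmetric KL divergence between two univariate Gaussians;
        # the log-variance terms of the two directions cancel.
        (m1, v1), (m2, v2) = p, q
        return 0.5 * (v1 / v2 + v2 / v1 - 2.0
                      + (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2))

    def update(self, x):
        """Feed one observation; return True if a shift is detected."""
        if self.f is None:
            self.f = self.s = (float(x), 1.0)  # broad initial variance
            return False
        self.f = self._ewma(self.f, self.fast, x)
        self.s = self._ewma(self.s, self.slow, x)
        return self._sym_kl(self.f, self.s) > self.threshold
```

Fed a stable stream followed by an abrupt level shift, the detector stays quiet during the regular regime and alarms within a few samples of the shift, since the fast estimate moves away from the slow one far quicker than the slow one can follow.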
Tracking user terminals in a mobile communication network
A method of tracking user terminals in a mobile communication network is provided. At a tracking node, the method comprises: determining that a user terminal is located in a tracking area; storing data associated with the tracking area, the data comprising a number of observations of all user terminals at the tracking area at a first time; and receiving a page response from the user terminal located in either the tracking area or a further tracking area. In the event that the user terminal remains located in the tracking area, the data are updated to include the number of page responses received at the tracking area after a first time interval; in the event that the user terminal is located in the further tracking area, the data are updated to include the number of page responses received at the further tracking area after the first time interval.
Learning machines for computational epidemiology
Resting on our experience of computational epidemiology in practice and of industrial projects on the analytics of complex networks, we point to an innovation opportunity for improving the digital services available to epidemiologists for monitoring, modeling, and mitigating the effects of communicable disease. Artificial intelligence and intelligent analytics of syndromic surveillance data promise new insights to epidemiologists, but the real value can only be realized if human assessments are paired with assessments made by machines. Neither massive data itself, nor careful analytics, will necessarily lead to better-informed decisions. The process of producing feedback to humans on decision making informed by machines can be reversed to consider feedback to machines on decision making informed by humans, enabling learning machines. We argue that the sensemaking such machines can perform in tandem with humans can be of immense value to epidemiologists in the future.
Finding dependencies in industrial process data
Dependency derivation and the creation of dependency graphs are critical tasks for increasing the understanding of an industrial process. However, the most commonly used correlation measures are often not appropriate for finding correlations between time series. We present a measure that solves some of these problems.
- …